ABSTRACT

Let $c > 0$ be a constant, and $\Phi$ be a random Horn formula with $n$ variables and
$m = c \cdot 2^n$ clauses, chosen uniformly at random (with repetition) from the set of
all nonempty Horn clauses in the given variables. By analyzing PUR, a natural
implementation of positive unit resolution, we show that
$\lim_{n \to \infty} \Pr(\Phi \text{ is satisfiable}) = 1 - F(e^{-c})$, where
$F(x) = (1 - x)(1 - x^2)(1 - x^4)(1 - x^8) \cdots$. Our method also yields as a byproduct
an average-case analysis of this algorithm.



1. INTRODUCTION 



Phase transitions in combinatorial problems were first displayed in the seminal work
of Erdős and Rényi [ER60] on random graphs. Working with the constant probability
model G(n,p) they showed that the probability that the graph has a "large"
connected component exhibits a sharp increase at some "threshold" value of p. The
empirical observation from [CKT91], that for a number of NP-complete problems
the "hardest on the average" instances are located near such threshold points, has
attracted considerable interest in such threshold phenomena from several communities,
such as Theory of Computing, Artificial Intelligence and Statistical Mechanics.
Recent studies [MZK+99a, MZK+99b] have provided further evidence that (at least
some) phase transitions do indeed have an impact on algorithmic complexity, and have
offered additional insight into the cases when this happens.



It turns out that there are two different notions of phase transition in a
combinatorial problem P. One definition applies to optimization problems and directly
parallels the approach from Statistical Mechanics. Potential solutions for an
instance of P are viewed as "states" of a system. One defines an abstract Hamiltonian
(energy) function that measures the "quality" of a given solution, and applies
methods from the theory of spin glasses [MPV87] to make predictions on the typical
structure of optimal solutions. In this setting a phase transition is defined as
non-analytical behavior of a certain "order parameter" called free energy, and a
discontinuity in this parameter, manifested by the sudden emergence of a backbone
of constrained "degrees of freedom" [MZK+99a], is responsible for the exponential
slow-down of many natural algorithms.

The second definition is combinatorial and pertains to decision problems. It 
is the concept of threshold property from random graph theory, more precisely a 
restricted version of this notion, called sharp satisfiability threshold. A satisfiability 
threshold always exists for monotone problems [BT86], but may or may not be 
sharp (we speak of a coarse threshold in the latter case). It is this notion of phase 
transition that we are concerned with in this paper. 

From the practical perspective of [CKT91] phase transitions are most appealing
in problems that are thought to be "hard", in particular, in NP-complete problems.
Therefore a lot of recent work has been directed towards locating phase transitions
in such problems. In some cases, the most prominent of which is Hamiltonian
cycle [KS83], a complete analysis has been obtained. In others (e.g., 3-SAT [FS96,
KKKS97, Ach, JSV] and graph coloring [Chv91, AM97]), obtaining such an analysis
is hard, and indeed a task not yet accomplished: for these problems there exists a
fairly large gap between the best rigorous lower and upper bounds, and the methods
that were used to obtain these bounds do not seem to be capable of yielding a tight
analysis.

Understanding the reasons that make problems with similar computational com- 
plexity differ so much with respect to their "mathematical tractability" is clearly 
a topic worth investigating. A natural intuitive explanation of this discrepancy is 
that problems that are easy to analyze "coincide with high probability" with prob- 
lems with a simple "local" structure, while problems that are "hard to analyze" 
lack such an approximation. Such is the case, for instance, of the above mentioned 
Hamiltonian cycle, that "coincides with high probability" with the graph prop- 
erty "having minimum degree two" [AKS85]. Support in favor of this intuition 
also comes from Friedgut's result on the existence of a sharp threshold for 3-SAT 
[Fri99]: his proof relies on showing that problems with coarse thresholds can be 
well approximated by some simple "local" property, and then proving that 3-SAT 
lacks such an approximation. While his result sheds no light on the "mathematical 
tractability" of Hamiltonian cycle, it is tempting to speculate that there might be a 
suitable generalization of the concept of "coarse threshold" , that 3-SAT still lacks, 
and that encompasses all known "mathematically tractable cases" . 

A natural testbed for the above intuition is the case of polynomial time solvable
problems. In these cases the hypothesis predicts that one should be able to
obtain a complete analysis: often tractability arises from the existence of a "local"
characterization that circumvents the need for exhaustively searching the
exponentially large space of potential solutions. Another reason is methodological:
studying tractable problems usually amounts to probabilistic analyses of decision
algorithms for these problems using a methodology based on Markov chains, a task
that can often be accomplished.

Such an approach was successful for some tractable versions of propositional
satisfiability: out of the six maximally tractable cases of SAT that Schaefer
identified in his celebrated Dichotomy Theorem [Sch78], two are trivially satisfiable
and two have completely analyzed phase transitions. The transition for 2-SAT, the
satisfiability problem for CNF formulas with clauses of size two, has been studied
in [CR92, Goe96], and that for XOR-SAT, the satisfiability problem for linear
systems of equations with boolean variables, has been studied in [CD96]. The
remaining two cases are the Horn formulas and the negative Horn formulas (which
are, of course, dual).

In this paper we deal with these two cases. Unlike the other two nontrivial
cases, we show that Horn satisfiability has a coarse threshold. In the "critical
region" the number of clauses is exponential in the number of variables; hence,
from a practical perspective, our results show that if we do not restrict clause
length, random Horn formulas of practical interest are almost certainly satisfiable
(we have subsequently analyzed the bounded clause length case in [Ist]). Also, we
obtain our result by modeling PUR, a natural implementation of positive unit
resolution, by a Markov chain, and our method yields as a byproduct an average-case
analysis of this algorithm.

2. RESULTS 

A Horn clause is a disjunction of literals containing at most one positive literal. 
It will be called positive if it contains a positive literal and negative otherwise. 
A Horn formula is a conjunction of Horn clauses. Horn satisfiability (denoted 
by HORN-SAT) is the problem of deciding whether a given Horn formula has a 
satisfying assignment. 

Since our main interest is in phase transitions in decision problems in the class 
NP, we will discuss the notion of satisfiability threshold in the framework of NP- 
decision problems. Our definition is slightly different from the standard one (e.g. 
[Pap94]), and accommodates the fact that legal encodings of instances of a problem 
have in general lengths from a restricted set of values. 

Definition. An NP-decision problem is a four-tuple $P = (\Sigma, D, f, g)$ such that

1. $\Sigma$ is a finite alphabet.

2. $f, g : \mathbb{N} \to \mathbb{N}$ are polynomial time computable, polynomially
bounded functions. In addition $f$ has range $\{0,1\}$. A length $n$ is called
admissible if $f(n) = 1$.

3. $D \subseteq \Sigma^* \times \Sigma^*$ is a polynomial time computable relation.

4. for every pair $(x, y) \in \Sigma^* \times \Sigma^*$, if $(x, y) \in D$ then the
length of $x$ is admissible and $|y| \le g(|x|)$.

A string $x$ having an admissible length will be called an instance of P. A string
$y$ such that $(x, y) \in D$ is called a witness for $x$, and we write $x \in P$ to
state the fact that there exists a witness for the instance $x$. Finally, problem P
is monotonically decreasing if for every instance $x$ of P and every witness $y$ for
$x$, $y$ is a witness for every instance $z$ obtained by turning some bits of $x$
from 1 to 0. Monotonically increasing problems can be similarly defined.

The three standard probabilistic models from random graph theory [Bol85] (the
constant probability model, the counting model and the multiset model) extend directly
to any NP-decision problem, and are equivalent under fairly liberal conditions. For
the purposes of this paper we recall the definition of the multiset model:

Definition. Let P be an NP-decision problem. The random multiset model
$\Omega(n, m)$ has two parameters, an admissible length $n$ and an instance density
$1 \le m \le n$. A random sample $x$ from $\Omega(n, m)$ is an instance of P obtained
by first setting $x = 0^n$, then choosing, uniformly at random and with repetition,
$m$ bits of $x$ and switching them to 1.
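
The sampling procedure in this definition can be sketched in a few lines of Python.
This is only an illustration under our own naming: the function
`sample_multiset_instance` and the list-of-bits encoding are assumptions of the
sketch, not part of the formal model.

```python
import random


def sample_multiset_instance(n, m, rng=random):
    """Draw x from the multiset model: start from 0^n, then pick m bit
    positions uniformly at random, with repetition, and set them to 1."""
    x = [0] * n
    for _ in range(m):
        # Positions may repeat, so fewer than m bits can end up set.
        x[rng.randrange(n)] = 1
    return x
```

Because positions are drawn with repetition, the number of ones in the sample is
between 1 and min(n, m) rather than exactly m.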

Next we define our threshold properties for monotonically decreasing problems
under the multiset model. Similar definitions can be given for monotonically in- 
creasing problems, or when using one of the two other random models. 

Definition. Let P be any monotonically decreasing decision problem under the
multiset random model $\Omega(n, m)$. A function $\theta$ is a threshold function for
P if for every function $m$, defined on the set of admissible instances and taking
integer values, we have

1. if $m(n) = o(\theta(n))$ then $\lim_{n \to \infty} \Pr_{x \in \Omega(n, m(n))}[x \in P] = 1$, and

2. if $m(n) = \omega(\theta(n))$ then $\lim_{n \to \infty} \Pr_{x \in \Omega(n, m(n))}[x \in P] = 0$.

$\theta$ is called a sharp threshold if in addition the following property holds:

3. For every $\varepsilon > 0$ define the two functions $\mu_1(n)$, $\mu_2(n)$ by

$$\mu_1(n) = \min\{m \in \mathbb{N} : \Pr_{x \in \Omega(n, m)}[x \in P] \le 1 - \varepsilon\},$$

$$\mu_2(n) = \min\{m \in \mathbb{N} : \Pr_{x \in \Omega(n, m)}[x \in P] \le \varepsilon\}.$$

Then we have

$$\lim_{n \to \infty} \frac{\mu_2(n) - \mu_1(n)}{\theta(n)} = 0.$$

If, on the other hand, for some $\varepsilon > 0$ the amount
$\frac{\mu_2(n) - \mu_1(n)}{\theta(n)}$ is bounded away from 0 as $n \to \infty$,
$\theta$ is called a coarse threshold. These two cases are not exhaustive



Procedure PUR($\Phi$)

if ($\Phi$ contains no positive unit clauses)
    return TRUE
else
    choose a random positive unit clause $x$
    if ($\Phi$ contains the clause $\bar{x}$)
        return FALSE
    else
        let $\Phi'$ be the formula obtained
        by setting $x$ to 1 in $\Phi$
        return PUR($\Phi'$).

Fig. 1. Algorithm PUR

as the above quantity could in principle oscillate with n. Nevertheless they are so 
for most "natural" problems. 

A useful modification of the above framework has the set of admissible lengths
specified by an increasing function $N : \mathbb{N} \to \mathbb{N}$. We correspondingly
redefine the random model as $\Omega(n, m) := \Omega(N(n), m)$ and the threshold
function by $\theta(n) := \theta(N(n))$. Such will be the case of random Horn
satisfiability, for which a random formula from $\Omega(n, m)$ is obtained by choosing
$m$ clauses independently, uniformly at random and with repetition from the set of all
$N(n) = (n + 2) \cdot 2^n - 1$ Horn clauses over variables $x_1, \ldots, x_n$.

The following is our main result: 

Theorem 2.1. $\theta(n) = 2^n$ is a threshold function for random Horn satisfiability.
Moreover, for every constant $c > 0$,

$$\lim_{n \to \infty} \Pr_{\Phi \in \Omega(n, c \cdot 2^n)}[\Phi \text{ is satisfiable}] = 1 - F(e^{-c}), \qquad (2.1)$$

where

$$F(x) = (1 - x)(1 - x^2)(1 - x^4) \cdots (1 - x^{2^k}) \cdots.$$

The result makes clear that random Horn satisfiability has a coarse threshold. 
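
To get a feeling for the limit in (2.1), the infinite product can be evaluated
numerically. The following Python sketch squares the current power of $x$ at each
step, so only a handful of factors are needed. The names `F` and
`limit_sat_probability` and the truncation tolerance are choices made for this
illustration only.

```python
import math


def F(x, tol=1e-16):
    """Approximate (1 - x)(1 - x^2)(1 - x^4)... for 0 <= x < 1."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    product, power = 1.0, x
    while power > tol:
        product *= 1.0 - power
        power *= power  # x^(2^k) becomes x^(2^(k+1))
    return product


def limit_sat_probability(c):
    """Limiting satisfiability probability 1 - F(e^{-c}) for c > 0."""
    return 1.0 - F(math.exp(-c))
```

Evaluating `limit_sat_probability` on a grid of values of $c$ shows a smooth decrease
from values near 1 to values near 0, with no jump, which is the numerical face of
the coarse threshold.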

The algorithm PUR, employed in the proof of Theorem 2.1, is displayed in Fig. 1.
PUR is a natural implementation of positive unit resolution, which is complete for
HORN-SAT [HW74].
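
For concreteness, here is a minimal executable sketch of the procedure of Fig. 1 in
Python. The clause encoding is an assumption of the sketch, not of the paper: a Horn
clause is a pair `(negatives, head)`, where `negatives` is the set of variables that
occur negated and `head` is the positive variable, or `None` for a negative clause.
The recursion of Fig. 1 is unrolled into a loop.

```python
import random


def pur(clauses, rng=random):
    """Positive unit resolution on a Horn formula.

    clauses: iterable of (negatives, head) pairs (hypothetical encoding).
    Returns True if the formula is satisfiable, False otherwise.
    """
    formula = [(frozenset(neg), head) for neg, head in clauses]
    while True:
        # Positive unit clauses: a head and no negated variables.
        units = [head for neg, head in formula if head is not None and not neg]
        if not units:
            return True
        x = rng.choice(units)
        # The negative unit clause (not x) contradicts the unit clause x.
        if (frozenset([x]), None) in formula:
            return False
        # Set x to 1: drop clauses satisfied by x, delete the literal (not x).
        formula = [(neg - {x}, head) for neg, head in formula if head != x]
```

Each pass through the loop removes at least the chosen unit clause, so the loop runs
at most once per clause of the input.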

As a byproduct, our analysis yields the following two results, which provide an 
average-case analysis of PUR: 



Theorem 2.2. Let $X_n \in [0, n]$ be the r.v. denoting the number of iterations of
PUR on a random satisfiable formula $\Phi \in \Omega(n, c \cdot 2^n)$. Then $X_n$
converges in distribution to a distribution $\rho$ on $[0, \infty)$ having support on
the nonnegative integers, $\rho = (\rho_k)_{k \ge 0}$, $\rho_k = \mathrm{Prob}[\rho = k]$,
given by

$$\rho_k = \frac{e^{-2^k c}}{1 - F(e^{-c})} \prod_{i=1}^{k} \left(1 - e^{-2^{i-1} c}\right).$$
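
The limit law can be tabulated numerically. The sketch below assumes that $\rho_k$
is proportional to $e^{-2^k c} \prod_{i=1}^{k} (1 - e^{-2^{i-1} c})$, which is our
reading of the formula in Theorem 2.2; the normalizing constant is computed from the
weights themselves rather than taken from a closed form. The function name and the
cut-off `k_max` are hypothetical.

```python
import math


def iteration_distribution(c, k_max=60):
    """Return [rho_0, ..., rho_{k_max}] under the assumed product form."""
    weights = []
    survive = 1.0  # product over i < k of (1 - exp(-2^i c))
    for k in range(k_max + 1):
        accept = math.exp(-(2.0 ** k) * c)  # acceptance after k iterations
        weights.append(survive * accept)
        survive *= 1.0 - accept
    total = sum(weights)  # numerically equal to 1 - F(e^{-c})
    return [w / total for w in weights]
```

The weights telescope, so their sum agrees with $1 - F(e^{-c})$ up to rounding, and
the mass concentrates on the first few values of $k$ for any fixed $c > 0$.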

The case of unsatisfiable formulas displays one feature not present in the previous 
result: fluctuations due to the nature of the binary expansion of n, wobbles in the 
terminology of P. Flajolet [Fla]. 



Theorem 2.3. Let $Y_n$ be the r.v. denoting the number of iterations of PUR on
a random formula $\Phi \in \Omega(n, c \cdot 2^n)$, and, for $k \in [0, n]$, possibly a
function of $n$, let $\eta_{n,k}$ be the probability that
$Y_n = \lfloor \log_2 n \rfloor + k$, conditional on $\Phi$ being unsatisfiable.
Then

$\lim_{n \to \infty} |k - \log_2(n)| = \infty$ implies that $\lim_{n \to \infty} \eta_{n,k} = 0$;
for every $k \in \mathbb{Z}$,

$$\eta_{n,k} = G(k - 1, c_n) - G(k, c_n) + o(1),$$

where 

G(k,c) = e ~ c( ^=-oo 23 \ 
c 

c, 



2{io g2 (v^)} ' 



3. NOTATION AND USEFUL RESULTS 



For $n \in \mathbb{N}$ and $0 < p < 1$, we denote by $B(n, p)$ a random variable
having a binomial distribution with parameters $n, p$. For $\lambda \in \mathbb{R}$,
$Po(\lambda)$ will denote a Poisson distribution with expected value $\lambda$.

We will use "with high probability" (w.h.p.) as a substitute for "with probability
$1 - o(1)$." We also say that a sequence $(p_n)_{n \in \mathbb{N}}$ of real numbers is
exponentially small (written $o(1/\mathrm{poly})$) if for every polynomial $Q$,
$p_n = o(1/Q(n))$. We will measure, as usual, the distance between two probability
distributions with integer values $P = (p_i)$ and $Q = (q_i)$ by their total variation
distance $d_{TV}(P, Q) = \frac{1}{2} \sum_i |p_i - q_i|$, and recall the following
inequalities from [Shc84] and [BHJ92] (page 2 and Remark 1.4):



Lemma 3.1. If $n, p, \lambda, \mu > 0$ then

$$d_{TV}(B(n, p), Po(np)) \le \min\left\{np^2, \frac{3p}{2}\right\},$$

$$d_{TV}(Po(\lambda), Po(\mu)) \le |\mu - \lambda|.$$
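
The first inequality can be checked numerically for small parameters. The sketch
below computes the total variation distance between $B(n, p)$ and $Po(np)$ directly
from the two probability mass functions; the function name is ours, and the range of
$n$ is limited to what `math.comb` and floating-point arithmetic handle comfortably.

```python
import math


def tv_binomial_poisson(n, p):
    """Total variation distance between Binomial(n, p) and Poisson(n p)."""
    lam = n * p
    q = math.exp(-lam)  # Poisson mass at k = 0
    abs_diff, poisson_mass = 0.0, 0.0
    for k in range(n + 1):
        b = math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        abs_diff += abs(b - q)
        poisson_mass += q
        q *= lam / (k + 1)  # Poisson mass at k + 1
    # Above n the binomial has no mass, so the Poisson tail counts in full.
    abs_diff += max(0.0, 1.0 - poisson_mass)
    return 0.5 * abs_diff
```

For instance, with $n = 100$ and $p = 0.01$ the computed distance stays below
$np^2 = 0.01$.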

Definition. Given two probability distributions $D$ and $D'$, we say that $D'$
stochastically dominates $D$ if for every $x$, $\Pr[D \ge x] \le \Pr[D' \ge x]$, and
write $D \preceq D'$ when this holds.

The following are two conditional probability tricks. 

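Fact 3.1. Let $A_n$, $B_n$, and $C_n$ be events such that
$\Pr[C_n \mid B_n] = 1 - o(1)$. Then
$|\Pr[A_n \mid B_n] - \Pr[A_n \mid B_n \wedge C_n]| = o(1)$.

Proof.

Applying the chain rule for conditional probability we get

$$\begin{aligned}
&|\Pr[A_n \mid B_n] - \Pr[A_n \mid B_n \wedge C_n]| \\
&\quad = |\Pr[A_n \mid B_n \wedge C_n] \cdot \Pr[C_n \mid B_n] + \Pr[A_n \mid B_n \wedge \overline{C_n}] \cdot \Pr[\overline{C_n} \mid B_n] - \Pr[A_n \mid B_n \wedge C_n]| \\
&\quad = |\Pr[A_n \mid B_n \wedge C_n] \cdot (1 - o(1)) + \Pr[A_n \mid B_n \wedge \overline{C_n}] \cdot o(1) - \Pr[A_n \mid B_n \wedge C_n]| = o(1).
\end{aligned}$$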



Fact 3.2. If $B$ is a random variable taking integer values in the interval $I$, then
for every event $A$,

$$\min_{\lambda \in I}\{\Pr[A \mid (B = \lambda)]\} \le \Pr[A] \le \max_{\lambda \in I}\{\Pr[A \mid (B = \lambda)]\}.$$

Several "concentration of measure" results will be used in the sequel. They 
include: 

Proposition 3.2. (Chernoff bound) Let $X_1, \ldots, X_n$ be independent 0/1 random
variables with $\Pr(X_i = 1) = p$. Let $X = X_1 + \ldots + X_n$, $\mu = E[X]$ and $\delta > 0$. Then



Pr[\X-fi\ >S-^}< 
A related inequality from [AES92] is: 



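Proposition 3.3. Let $P$ have Poisson distribution with mean $\mu$. For $\varepsilon > 0$,

$$\Pr[P \le \mu \cdot (1 - \varepsilon)] \le e^{-\varepsilon^2 \mu / 2},$$

$$\Pr[P \ge \mu \cdot (1 + \varepsilon)] \le \left[e^{\varepsilon} (1 + \varepsilon)^{-(1 + \varepsilon)}\right]^{\mu}.$$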
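
As a sanity check, the two Poisson tail bounds can be compared with exact tail
probabilities. The sketch below assumes the bounds in their standard form,
$e^{-\varepsilon^2 \mu / 2}$ for the lower tail and
$[e^{\varepsilon}(1+\varepsilon)^{-(1+\varepsilon)}]^{\mu}$ for the upper tail; all
function names are ours.

```python
import math


def poisson_pmf_prefix(mu, k_max):
    """Return [Pr[P = 0], ..., Pr[P = k_max]] for P ~ Poisson(mu)."""
    pmf, p = [], math.exp(-mu)
    for k in range(k_max + 1):
        pmf.append(p)
        p *= mu / (k + 1)
    return pmf


def lower_tail(mu, eps):
    """Exact Pr[P <= mu (1 - eps)]."""
    cutoff = math.floor(mu * (1.0 - eps))
    return sum(poisson_pmf_prefix(mu, cutoff)) if cutoff >= 0 else 0.0


def upper_tail(mu, eps):
    """Exact Pr[P >= mu (1 + eps)]."""
    cutoff = math.ceil(mu * (1.0 + eps))
    return 1.0 - sum(poisson_pmf_prefix(mu, cutoff - 1))


def lower_tail_bound(mu, eps):
    return math.exp(-eps * eps * mu / 2.0)


def upper_tail_bound(mu, eps):
    return (math.exp(eps) * (1.0 + eps) ** (-(1.0 + eps))) ** mu
```

For moderate $\mu$ the bounds are far from tight, but they decay exponentially in
$\mu$, which is all the later concentration arguments use.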

We regard the algorithm PUR as working in stages, indexed by the number of
variables still left unassigned; thus the stage number decreases as PUR moves on.
Let $\Phi$ denote an input formula over $n$ variables. For $i$, $1 \le i \le n$, $A_i$,
$R_i$, and $S_i$ respectively denote the event that PUR accepts at stage $i$, the event
that PUR rejects at stage $i$, and the event that PUR reaches stage $i - 1$ ("survives
stage $i$"). Also, $\Phi_i$ denotes the formula $\Phi$ at the beginning of stage $i$,
$N_i$ denotes the number of clauses of $\Phi_i$, $HP_{1,i}$ the number of positive unit
clauses of $\Phi_i$, $HP_{2,i}$ the number of positive non-unit clauses, $HN_{1,i}$ the
number of negative unit clauses and $HN_{2,i}$ the number of negative non-unit clauses.
Finally, for simplicity define $\Pi = F(e^{-c})$ and $\Pi_i$ to be the product of the
first $i$ terms from $\Pi$.

We will assert stochastic domination via couplings of Markov chains (for an 
extensive treatment see [Lin92]). The framework needed for our coupling result is 
made precise in the following definitions (especially tailored for the context of this 
paper, rather than being standard). 

Definition. Let $(X_n)$ be a Markov chain having state space $S$ and transition
matrix $X$. A stopping rule $H$ for $X_n$ is a set $H$ of transitions of $(X_n)$ (i.e.
pairs of states $(i, j) \in S \times S$ such that $X_{i,j} > 0$).

Intuition: We will use stopping rules $H$ to talk about the probability (denoted
$\Pr[A \mid H]$) of properties $A$ of the Markov chain that only hold conditional on
$(X_n)$ making only transitions from $H$.

Definition. Let $X_t = (X_{0,t}, \ldots, X_{d,t})$ and $Y_t = (Y_{0,t}, \ldots, Y_{d,t})$
be two Markov chains on $\mathbb{Z} \times \mathbb{Z}^d$ having transition matrices
$X$, $Y$, respectively. Let $H_1$, $H_2$ be two stopping rules for $(X_n)$, $(Y_n)$,
respectively. Let $\emptyset \ne B \subseteq \{0, \ldots, d\}$. A
$(B, H_1, H_2)$-majorizing (Markovian) coupling of $X$ and $Y$ is a Markov chain
$Z_t = (Z_{t,1}, Z_{t,2})$ on $(\mathbb{Z} \times \mathbb{Z}^d)^2$,
$Z_{t,1} = (Z_{t,0,1}, \ldots, Z_{t,d,1})$, $Z_{t,2} = (Z_{t,0,2}, \ldots, Z_{t,d,2})$,
having transition matrix $(Z_{(i,j),(k,l)})_{i,j,k,l \in \mathbb{Z}^{d+1}}$ such that:

• for every $i, j \in \mathbb{Z}^{d+1}$, $\Pr[Z_{t+1,1} = j \mid Z_{t,1} = i] = X_{i,j}$,

• for every $i, j \in \mathbb{Z}^{d+1}$, $\Pr[Z_{t+1,2} = j \mid Z_{t,2} = i] = Y_{i,j}$,

• for every $i, j, k, l \in \mathbb{Z}^{d+1}$, if $Z_{(i,j),(k,l)} > 0$ and $(i, k) \in H_1$ then $(j, l) \in H_2$,

• for every $t \ge 0$ and every state $(Z_{t,1}, Z_{t,2})$ of $Z_t$ reachable through
moves in $H_1 \times (\mathbb{Z}^{d+1})^2$ only, we have

$Z_{t,i,1} = Z_{t,i,2}$ for all $i \in B$,



and 



Intuition: The first two conditions express the fact that the coupling is
Markovian. The third condition (denoted symbolically $H_1 \le H_2$) relates the two
stopping rules. Finally, the last condition allows us to compare two quantities of
interest for the Markov chains $(X_n)$ and $(Y_n)$, namely $\sum_{i \in B} X_{i,t}$ and
$\sum_{i \in B} Y_{i,t}$.

Let us now formally state this comparison result. 

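Lemma 3.4. Let $(X_t)$, $(Y_t)$, $H_1$, $H_2$, $B$ be as in the previous definition, and
suppose it is possible to construct a $(B, H_1, H_2)$-majorizing coupling of $(X_t)$ and
$(Y_t)$. Then, for every $a \in \mathbb{Z}$,

$$\Pr\Big[\sum_{i \in B} X_{i,t} > a \,\Big|\, H_1\Big] \le \Pr\Big[\sum_{i \in B} Y_{i,t} > a \,\Big|\, H_2\Big].$$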



Proof. 

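Define

$$H_{B,a} = \Big\{\lambda = (\lambda_0, \ldots, \lambda_d) : \sum_{i \in B} \lambda_i > a\Big\}.$$

Then

$$\begin{aligned}
\Pr[X_t \in H_{B,a} \mid H_1] &= \sum_{x \in H_{B,a}} \Pr[X_t = x \mid H_1] && (3.1)\\
&= \sum_{x \in H_{B,a}} \Pr[Z_{t,1} = x \mid H_1 \times S^2] && (3.2)\\
&= \sum_{x \in H_{B,a}} \sum_{y \in S} \Pr[(Z_{t,1} = x) \wedge (Z_{t,2} = y) \mid H_1 \times S^2] && (3.3)\\
&= \sum_{x \in H_{B,a}} \sum_{y \in S} \Pr[(Z_{t,1} = x) \wedge (Z_{t,2} = y) \mid H_1 \times H_2] && (3.4)\\
&= \sum_{x \in H_{B,a}} \sum_{y \in H_{B,a}} \Pr[(Z_{t,1} = x) \wedge (Z_{t,2} = y) \mid H_1 \times H_2] && (3.5)\\
&= \sum_{y \in H_{B,a}} \sum_{x \in H_{B,a}} \Pr[(Z_{t,1} = x) \wedge (Z_{t,2} = y) \mid H_1 \times H_2] && (3.6)\\
&\le \sum_{y \in H_{B,a}} \sum_{x \in S} \Pr[(Z_{t,1} = x) \wedge (Z_{t,2} = y) \mid H_1 \times H_2] && (3.7)\\
&\le \sum_{y \in H_{B,a}} \sum_{x \in S} \Pr[(Z_{t,1} = x) \wedge (Z_{t,2} = y) \mid S^2 \times H_2] && (3.8)\\
&= \sum_{y \in H_{B,a}} \Pr[Z_{t,2} = y \mid S^2 \times H_2] && (3.9)\\
&= \Pr[Y_t \in H_{B,a} \mid H_2]. && (3.10)
\end{aligned}$$

Procedure PUR$_2$($\Phi$)

if ($\Phi$ contains no positive unit clauses)
    first eliminate a random clause
    then independently, with probability 1/t
    eliminate every remaining clause,
    and continue recursively
else
    choose a random positive unit clause $x$
    set $x$ to 1 in $\Phi$
    and continue recursively

Fig. 2. Second version of PUR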

Lines 3.2, 3.10 follow from the Markovian character of the coupling. Line 3.4
follows from $H_1 \le H_2$. The rest are simple arithmetical calculations.

■ 

The couplings we need are very simple, and employ the following idea: suppose
the recurrences describing $X_{t+1} - X_t$ and $Y_{t+1} - Y_t$ are identical, except for
one term, which is $B(m_1, \tau)$ for $(X_t)$ and $B(m_2, \tau)$ for $(Y_t)$, where
$m_1 \le m_2$ are positive integers and $\tau \in (0, 1)$. Obtain a coupling by
identifying $B(m_1, \tau)$ with the outcome of the first $m_1$ Bernoulli experiments in
$B(m_2, \tau)$.
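
The coupling idea in the last paragraph is easy to state in code. The following
Python sketch draws both binomial variables from one shared sequence of Bernoulli
experiments, so that the smaller one never exceeds the larger one; the function name
and argument names are ours.

```python
import random


def coupled_binomials(m1, m2, r, rng=random):
    """Return (X, Y) with X ~ B(m1, r), Y ~ B(m2, r) and X <= Y always.

    X counts successes among the first m1 of the m2 shared experiments.
    """
    if m1 > m2:
        raise ValueError("the coupling requires m1 <= m2")
    trials = [rng.random() < r for _ in range(m2)]
    return sum(trials[:m1]), sum(trials)
```

Each coordinate has the correct marginal distribution, and the pointwise inequality
$X \le Y$ is what yields stochastic domination.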

4. THE UNIFORMITY LEMMA 

The crux of our analysis relies on the observation that the behavior of PUR on a 
random Horn instance can be described by a stochastic recurrence (Markov chain) . 

Lemma 4.1. ("The Uniformity Lemma")

1. Suppose PUR does not halt before stage $t$. Then, conditional on $N_t$, the
clauses of $\Phi_t$ are random and independent.

2. Consider PUR$_2$, the modified version of the algorithm PUR from Figure 2
(that does not check for accepting/rejecting, but may produce empty clauses).
Let $E_i$ represent the number of empty clauses at stage $i$. Then for every stage
$t$, conditional on $\Gamma_t = (HN_{1,t}, HN_{2,t}, HP_{1,t}, HP_{2,t}, E_t)$ the clauses
of $\Phi_t$ are chosen uniformly at random and are independent.

3. Consider again the original version of PUR. Suppose now that we condition
on $\Gamma_t$ and on the fact that $\Phi$ survives Stage $t$ as well. Then we have

$$N_{t-1} = N_t - \Delta_{1,P}(t) - \Delta_{2,P}(t), \qquad (4.1)$$

where

• $\Delta_{1,P}(t)$, the number of positive unit clauses that are satisfied at stage
$t$, has the distribution $1 + B(HP_{1,t} - 1, \frac{1}{t})$.

• $\Delta_{2,P}(t)$, the number of positive non-unit clauses that are satisfied at stage
$t$, has the binomial distribution $B(HP_{2,t}, \frac{1}{t})$.
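
Recurrence (4.1) can be simulated one stage at a time. The sketch below takes the
two binomial terms with success probability $1/t$, as we read the statement above;
the helper names `binomial` and `next_clause_count` are hypothetical and the sketch
models only the clause count, not the formula itself.

```python
import random


def binomial(n, p, rng):
    """Sample B(n, p) as a sum of n independent Bernoulli(p) experiments."""
    return sum(rng.random() < p for _ in range(n))


def next_clause_count(n_t, hp1, hp2, t, rng=random):
    """One step of (4.1): N_{t-1} = N_t - Delta_{1,P}(t) - Delta_{2,P}(t)."""
    if hp1 < 1:
        raise ValueError("surviving stage t requires a positive unit clause")
    delta_1p = 1 + binomial(hp1 - 1, 1.0 / t, rng)  # chosen clause plus copies
    delta_2p = binomial(hp2, 1.0 / t, rng)  # non-unit clauses with head x
    return n_t - delta_1p - delta_2p
```

When $t$ is large compared with $HP_{1,t} + HP_{2,t}$, a typical step removes just
the chosen unit clause, which matches the intuition that the clause count changes
slowly.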

Proof. 

The proof is based on the method of deferred decisions [KMP90]. The crux of
this method is to consider the random formula $\Phi$ as being disclosed gradually as
the algorithm proceeds, rather than as being completely determined at the very
beginning of the algorithm. Following a suggestion of Achlioptas [Ach], the process
can be conveniently imagined as having the occurrences of each literal in the formula
represented by a card that has the literal as its value. The cards corresponding to
each clause are arranged in separate piles, and are all initially face down (to reflect
the fact that initially we don't know anything about the formula). Part of the
unveiling process will consist of dealing (turning face up) the cards from each pile
that contain a specific literal. We also assume that (unless otherwise specified by
the unveiling process) the still undealt part of each pile is "hidden", so that we
don't know its height.

1. For the first part of the lemma (that conditions only on N t ) the disclosure 
process consists of first unveiling, at each stage greater than t, the location 
of a random positive unit clause of $\Phi_t$ (guaranteed to exist). We fill it with
a random variable among those left. The process continues by providing 

a. all the occurrences of this variable. 

b. the locations and complete contents of clauses that contain this variable 
in positive form, and 

c. the locations of the clauses that have been completely filled. 

We refer to the clauses in the latter two cases as blocked, since we have 
complete information about them, and they will no longer be involved in the 
unveiling process. 

Suppose PUR arrives at stage $t$ on $\Phi$. Then in stages $i = n, n-1, \ldots, t+1$,
$\Phi_i$ should have contained a unit clause consisting of a positive literal but not
its complement. This information does not condition in any way the structure
of the clauses of $\Phi_t$ that correspond to the non-blocked piles, counted by $N_t$.
In fact the only information we have at Stage $t$ about these piles is their
number $N_t$.

For each such pile all disclosed literals appear only in negative form, since
otherwise the clause would have been satisfied and blocked. Hence the residual
(hidden) part still obeys the Horn restriction. Given the uniformity in the
choice of the initial clauses of $\Phi$, it follows that the clauses of $\Phi_t$ are
chosen uniformly at random (and independently) among all nonempty Horn clauses
in the remaining variables.
2. We will prove the result inductively, starting with Stage n (where it certainly 
is true) and working downwards. At each stage, the disclosure process will 
offer some information on the type of the hidden portion of the clause, namely 
whether it is a positive unit, positive non-unit, negative or empty. 

Definition. For notational convenience define pi(t) = \, P2(t) = 2 *-i-i , 
Ps{t) = \, Pi{t) = {2 *Zt-i) ■ 

If $HP_{1,t} > 0$, to carry on the disclosure process:

a. choose a random positive unit clause, fill it with a random variable x 
among those left, and block. 

b. independently with probability 1/t fill any of the remaining positive unit 
clauses with x and block. 

c. for any positive non-unit clause: 

(i) with probability $p_1(t)$ fill one entry of the clause with x, fill the rest
of the clause with a random, non-empty combination of negated 
remaining literals and block. 

(ii) if the first case did not happen then, with probability $p_2(t)$, fill one
entry with x and set the type of the remaining clause to "positive 
unit" . 

(iii) if the first two cases did not happen then, with probability $p_3(t)$, fill
one entry with x (but do nothing else). 

(iv) otherwise do nothing. 

d. for any negative unit clause: 

(i) with probability $p_1(t)$ fill one entry of the clause with x, set the type
of the remaining clause to "empty" . 

(ii) otherwise do nothing. 

e. for any negative non-unit clause: 

(i) with probability $p_4(t)$ fill one entry of the clause with x and set the
type of the remaining clause to "negative unit" . 



(ii) if the first case did not happen then, with probability $p_3(t)$, fill one
entry of the clause with x (but do nothing else). 

(iii) otherwise do nothing. 

In the opposite case, $HP_{1,t} = 0$, the disclosure process consists of performing
the procedure described in the algorithm, and additionally filling every
eliminated clause with a random Horn clause in the remaining variables that
is not a positive unit clause.

By a tedious but straightforward case analysis it is easy to see that in both 
cases the uniformity property carries through to the next stage. The reason is 
that in all cases the only information we disclose about each remaining clause 
is its type, but not its content. Moreover, we get the following recurrences 
for the case $HP_{1,t} > 0$:



$$\begin{aligned}
HP_{1,t-1} &= HP_{1,t} - 1 - \Delta_{1,P}(t) + \Delta_{12,P}(t),\\
HP_{2,t-1} &= HP_{2,t} - \Delta_{2,P}(t) - \Delta_{12,P}(t),\\
HN_{1,t-1} &= HN_{1,t} - \Delta_{E}(t) + \Delta_{12,N}(t),\\
HN_{2,t-1} &= HN_{2,t} - \Delta_{12,N}(t),\\
E_{t-1} &= E_t + \Delta_{E}(t),
\end{aligned}$$

where

$$\begin{aligned}
\Delta_{1,P}(t) &= B(HP_{1,t} - 1, p_1(t)),\\
\Delta_{2,P}(t) &= B(HP_{2,t}, p_1(t)),\\
\Delta_{12,P}(t) &= B(HP_{2,t} - \Delta_{2,P}(t), p_2(t)),\\
\Delta_{E}(t) &= B(HN_{1,t}, p_1(t)),\\
\Delta_{12,N}(t) &= B(HN_{2,t}, p_4(t)).
\end{aligned}$$

3. The conditioning on PUR surviving Stage $t$ implies that up to Stage $t-1$ the
algorithm PUR and its modified version PUR$_2$ work in the same way. With
respect to PUR$_2$ it gives us one additional piece of information with respect
to merely conditioning on $\Gamma_t$: that $\Delta_E(t) = 0$. The desired recurrence
follows from the previous point.



A. Comments on the Uniformity Lemma 

A few comments on the contents of the uniformity lemma are in order. Although 
(as shown by Lemma 4.1 (i)) it would seem that we can characterize the state of 
PUR at Stage $t$ by a single number, $N_t$, this is not so, for two reasons:

• first, the above uniformity result is conditional (on PUR surviving Stage 
t + 1) and does not hold throughout the whole evolution of the algorithm. 
For instance it is not true at stages before stage t + 1, since unit clauses that 



Procedure PUR$_3$($\Phi$)

if ($\Phi$ contains no positive unit clauses)
    first eliminate a random clause
    then independently, with probability 1/t
    eliminate every remaining clause
    and continue recursively
else
    first, independently with probability 1/t
    eliminate every negative non-unit clause
    then
    choose a random positive unit clause $x$
    set $x$ to 1 in $\Phi$
    and continue recursively

Fig. 3. Third version of PUR

are the negation of the variable being set cannot appear. An unconditional 
uniformity result is provided by Lemma 4.1 (ii). However, it applies to a 
modified algorithm, which is no longer complete for HORN-SAT, and cannot
be used to obtain an exact result (rather than just a lower bound on the
threshold, as it is done e.g. in [FS96] for $k$-SAT).

• second, as shown by Lemma 4.1 (iii), a stochastic recurrence for $N_{t-1}$ cannot
be determined by only using the value of $N_t$; instead we need additional
information on the structure of $\Phi_t$ captured by the five-tuple $\Gamma_t$.

Fortunately it is possible to circumvent both these problems. On one hand it 
will turn out that all we need for the analysis is the conditional uniformity result 
(i), as long as we can "control" the value $N_t$. On the other hand, this value can be
indirectly estimated throughout the "most interesting regime of PUR".

B. A coupling result 

The following result makes a first step towards estimating $N_t$, by showing that
we can "approximate" this value by the value of a Markov chain with a simpler
structure. The intuitive idea is simple: by Lemma 4.1 (iii) the "net decrease"
$N_t - N_{t-1}$ is approximately $1 + B(HP_{1,t} + HP_{2,t} - 1, \frac{1}{t})$, which is
intuitively less than $1 + B(N_t - 1, \frac{1}{t})$.

Lemma 4.2. Consider the modified version of PUR from Figure 3. Then

1. Conditional on $\Gamma_t^{(2)} = (HN_{1,t}^{(2)}, HN_{2,t}^{(2)}, HP_{1,t}^{(2)}, HP_{2,t}^{(2)}, E_t^{(2)})$
(the same quantities as in Lemma 4.1 (ii); we only use the superscript to indicate the
fact that we are dealing with a different algorithm) the clauses of $\Phi_t$ (denote
their number by $N_t^{(2)}$) are uniform and independent.

2. Define $S_0 = \{[(a, b, c, d, e) \to (a_1, b_1, c_1, d_1, e_1)] : (c > 0) \,\&\&\, (e_1 = 0)\}$.
Define the stopping rules $H_2$, $H_3$ for $\Gamma_t$, $\Gamma_t^{(2)}$ to be respectively
the set of legal transitions of $\Gamma_t$, $\Gamma_t^{(2)}$ that are in $S_0$. Finally,
define $B = \{0, 1, 2, 3\}$.

Then it is possible to construct a $(B, H_2, H_3)$-majorizing coupling of the
Markov chains $\Gamma_t$ and $\Gamma_t^{(2)}$.

3. If $HP_{1,t}^{(2)} > 0$ then
$N_{t-1}^{(2)} = N_t^{(2)} - 1 - \Delta_{1,P}(t) - \Delta_{2,P}(t) - \Delta_{1,N}(t) - \Delta_{2,N}(t)$,
where

$$\begin{aligned}
\Delta_{1,P}(t) &= B(HP_{1,t} - 1, \tfrac{1}{t}),\\
\Delta_{2,P}(t) &= B(HP_{2,t}, \tfrac{1}{t}),\\
\Delta_{1,N}(t) &= B(HN_{1,t}, \tfrac{1}{t}),\\
\Delta_{2,N}(t) &= B(HN_{2,t}, \tfrac{1}{t}).
\end{aligned}$$

Consequently, irrespective of the value of $HP_{1,t}^{(2)}$,

$$N_t^{(2)} - N_{t-1}^{(2)} = 1 + B(N_t^{(2)} - 1, \tfrac{1}{t}).$$



Proof. 



1. The proof is identical to the one of Lemma 4.1 (ii), and thus omitted. 

2. The intuition behind the definition of the set $S_0$ is simple, and displays the
connection with the desired analysis of the algorithm PUR: we restrict the
set of legal transitions of $\Gamma_t$, $\Gamma_t^{(2)}$ to those for which
$HP_{1,t} > 0$ and $E_{t-1} = 0$ (in other words those for which PUR survives stage
$t$, and thus works like PUR$_2$).

The coupling can be described in a very intuitive way. Suppose that we
carry on the disclosure process corresponding to the algorithm PUR$_2$, but
the blocking of a clause is accomplished by placing a red pebble on the
corresponding pile, rather than physically eliminating it. We modify this process
to also place, at each stage $j$ such that $HP_{1,j} > 0$, some blue pebbles on the
piles corresponding to negative non-unit clauses, as follows: each such clause
that has no pebble on it independently receives a blue pebble with probability
$1/j$. It is easy to see that the new pebbling process (red and blue) simulates
the algorithm PUR$_3$. The coupling easily follows.

3. The result follows from point 1, by separately considering the behavior of
PUR$_3$ in the two cases, $HP_{1,t}^{(2)} > 0$ and $HP_{1,t}^{(2)} = 0$.



5. THE PROOF OUTLINE 



We will prove only the second part of the theorem, since the first part directly
follows from it. By the proof of Lemma 4.1 the behavior of the algorithm can be
described (with the above mentioned caveats) by a stochastic recurrence involving
$N_t$. Proposition 6.1 below proves the important fact that with high probability $N_t$
stays close to its expected value, which is $N_n(1 - o(1))$ for $t = n - O(n^{1/2})$.

So, intuitively, the number of clauses of $\Phi_t$ stays (almost) the same, while the
number of variables decreases by one. The net effect of one iteration is thus to 
"double the constant c" . We build the proof on three technical lemmas, Lemmas 6.2, 
6.5, and 6.6. Intuitively, these lemmas show the following: 

• Lemma 6.2 states that with probability $1 - o(1)$ PUR rejects "in the first
$\log n + \Theta(1)$ stages" (if at all; we will make this more precise in Theorem 2.3).

• Lemma 6.5 states that with probability 1 — o(l) PUR does not reject in any 
fixed number of steps. 

• Lemma 6.6 obtains a coarse inequality for the satisfaction probability

$$e^{-c} - o(1) \le \Pr[\Phi \in \text{HORN-SAT}] \le \frac{e^{-c/4}}{1 - e^{-c/4}} + o(1).$$

A consequence of this result is that a constant number, say $k$, of iterations
"blows up" $c$ so that the resulting constant $2^k c$ is so large that $\Phi_{n-k}$ is
unsatisfiable with probability arbitrarily close to 1.

Next we obtain a relation between the probability that PUR rejects $\Phi_n$ and
the probability that PUR rejects $\Phi_{n-1}$ ($\Phi_{n-1}$ is defined with probability
$1 - o(1)$ in the case when $c = O(1)$ due to Lemma 6.5): the former is equal to the
latter multiplied by the probability that PUR survives stage $n$. This latter term is
one minus the probability that PUR accepts at stage $n$, which is asymptotically equal
to $e^{-c}$, and minus the probability that PUR rejects at step $n$, which is $o(1)$ and
can be asymptotically neglected. Iterating this relation for a large enough (but
constant) number of steps $k$ that make $\Pr[\Phi_{n-k} \text{ is unsatisfiable}]$
"close enough to 1" and the partial product $\Pi_k$ "close enough to $\Pi$" allows us to
argue that, for every $\varepsilon > 0$, the probability that PUR rejects is, for
sufficiently large $n$, within $\varepsilon$ of the value $\Pi$ prescribed by the theorem.



6. THE KEY LEMMAS 



Proposition 6.1. For every c > 0 and every t with $n - C\sqrt{n} < t < n$, the conditional 
probability that the inequality 

$$N_n - (n - t)\left[1 + \frac{2(N_n - 1)}{t}\right] \le N_j \qquad (6.1)$$

holds for all $t \le j \le n$, in the event that PUR reaches stage t, is $1 - o(1)$. 
Proof. 

For ease of notation, define $E_t$ to be the event that Relation 6.1 holds, and the 
sequences $y_t = N_n - (n - t)\left[1 + \frac{2(N_n - 1)}{t}\right]$ and $z_t = \ldots$ By Lemma 4.2 (ii) 
and Lemma 3.4 we have: 

Pt[N^ >y t \H 3 ]<Pr[N t >y t \H 2 }. 

But conditioning on H 3 , H 2 is the same thing as conditioning on the algorithms 
not remaining without unit clauses, and not producing empty clauses, in other 
words working like PUR . So 

Pr[E t \S t+1 ) >Pr[7V t (2) >y t \H 3 }. 

H 3 implies that - Nf ] = B(ivj+ ) 1 - 1, ^j) for every j > t. 

So, defining the Markov chain $U_t$ by $U_n = N_n$ and $U_t - U_{t-1} = 1 + \eta_t$, where 
the $\eta_j$ are independent variables having the binomial distribution $B(N_j - 1, \frac{1}{j})$, it 
follows that 



Pr[Ut > Vt] = Pr[7Vr j > yt\H 3 ] < Pr[^|5 t+ i] (6.2) 

By the Chernoff bound, and reasoning inductively, we infer that with probability 
$1 - o(1)$ we have $\eta_j \le \frac{2(U_j - 1)}{j} \le \frac{2(N_n - 1)}{t}$ for every $t \le j \le n$. Plugging this 
inequality in the definition of $U_t$ and using equation 6.2 proves the lemma. ■ 

Lemma 6.2. Let $p = p(n)$ be such that $\lim_{n \to \infty}[n - \log_2 n - p(n)] = \infty$. Then 
$\Pr[R_p \mid S_{p+1}]$, i.e., the conditional probability that PUR rejects at stage p(n) in the 
event that PUR reaches stage p(n), is $1 - o(1)$. 

To prove this lemma we need the following trivial combinatorial result: 
Lemma 6.3. 

Let a(n) white balls and b(n) black balls be thrown independently into n bins. 
Pick a random bin among those containing a white ball, and let $X_n$ be the event 
that the chosen bin contains a black ball as well. Then $\Pr[X_n] = 1 - \left(1 - \frac{1}{n}\right)^{b(n)}$. 
Proof. 

It is easy to see that the bin we choose can be seen as the result of choosing a 
random bin among all n bins. So $\Pr[X_n]$ is simply the probability that a randomly 
chosen bin gets a black ball. But this is $1 - \left(1 - \frac{1}{n}\right)^{b(n)}$. 

■ 
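
Lemma 6.3 is easy to check empirically. The Python sketch below (a Monte Carlo illustration with our own function names and parameter choices, not part of the proof) throws a white and b black balls into n bins, picks a uniformly random bin among those holding a white ball, and compares the observed frequency of "the chosen bin also holds a black ball" with the closed form of the lemma. It assumes a ≥ 1, so that at least one bin contains a white ball.

```python
import random


def estimate_black_given_white(n: int, a: int, b: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of Pr[X_n] for a white and b black balls in n bins."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        white_bins = {rng.randrange(n) for _ in range(a)}
        black_bins = {rng.randrange(n) for _ in range(b)}
        # Pick a uniformly random bin among those that received a white ball.
        chosen = rng.choice(sorted(white_bins))
        hits += chosen in black_bins
    return hits / trials


def lemma_6_3_value(n: int, b: int) -> float:
    """Closed form from Lemma 6.3: 1 - (1 - 1/n)^b."""
    return 1.0 - (1.0 - 1.0 / n) ** b


if __name__ == "__main__":
    n, a, b = 20, 5, 10
    print("simulated:", estimate_black_given_white(n, a, b, trials=20000))
    print("lemma 6.3:", lemma_6_3_value(n, b))
```

The agreement does not depend on a, which reflects the independence of the white and black throws used in the proof.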

Proof of Lemma 6.2: 



Let T denote the event $E_n \wedge E_{n-1} \wedge \cdots \wedge E_p$. It follows from Proposition 6.1 
that $\Pr[T \mid S_{p+1}] = 1 - o(1)$. Then, by Fact 3.1, $\Pr[R_p \mid S_{p+1}] = \Pr[R_p \mid S_{p+1} \wedge T] + o(1)$. 
Since T implies $N_p \in I = [y_p, z_p]$, 

$$\Pr[R_p \mid S_{p+1} \wedge T] \ge \min_{\lambda \in I} \{\Pr[R_p \mid S_{p+1} \wedge T \wedge (N_p = \lambda)]\}.$$

Thus, the claim holds if we show that $\max_{\lambda \in I} \Pr[\overline{R_p} \mid S_{p+1} \wedge T \wedge (N_p = \lambda)] = o(1)$. 

Suppose that $N_p = \lambda$, the events T, $S_{p+1}$ hold, and we further condition on 
the number of negative unit clauses. The event $R_p$ can be mapped into $X_p$ of the 
previous "balls into bins" experiment, with the positive unit clauses representing 
the white balls, the negative unit clauses being the black balls, and the remaining 
p variables being the bins. 

From Lemma 4.1 it follows that the number of negative unit clauses of $ p 
has a binomial distribution B(X, 7 ^). Since A^y > y P j^ = (1 + o(l))c • 
2 lo g2(«)+p(") = uj(n), it follows easily by the Chernoff bound that with probability 
1 — o(l) the number of both positive and negative unit clauses of $ p is larger than 
Since this amount is $\omega(n)$, the claim is a consequence of Lemma 6.3. ■ 

Proposition 6.4. With probability $1 - o(1)$ PUR does not reject $\Phi$ at stage n. 
Proof. 

Let U be the number of unit clauses in $\Phi$. The variable U has a binomial 
distribution with parameters $2^n c$ and $\frac{2n}{(n+2)2^n - 1}$, so it is asymptotically a Poisson 
distribution with parameter 2c. In fact Proposition 3.1 and Proposition 3.3 together 
imply that with probability $1 - o(1)$, $U \le 2c(1 + n^{1/3}) \le 4cn^{1/3}$. 

Consider the U unit clauses of $\Phi$ as being balls to be tossed into n bins. The 
probability that two of them end up in the same bin is at most $\binom{U}{2} \cdot \frac{1}{n}$, which, 
in view of the above upper bound on U, is o(1). So with probability $1 - o(1)$ no 
variable appears more than once in a unit clause of $\Phi$, and thus, PUR does not 
reject. 

Lemma 6.5. For every k > 0, with probability $1 - o(1)$, PUR does not reject in 
any of the stages $n, n - 1, \ldots, n - k + 1$. 

Proof. 

A simple induction on k, coupled with the fact that, conditioned on $N_t$, $\Phi_t$ is a 
random formula, and Proposition 6.1. ■ 

Lemma 6.6. For every positive constant c, 

$$e^{-c} - o(1) \le \Pr[\Phi \in \text{HORN-SAT}] \le \frac{e^{-c/4}}{1 - e^{-c/4}} + o(1).$$
Proof. 

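Let c > 0 be a constant. 

$$\begin{aligned}
\Pr[\Phi \in \text{HORN-SAT}] &\ge \Pr[\text{PUR accepts at the first step}] \\
&= \Pr[\Phi \text{ contains no positive unit clauses}] \\
&= \left(1 - \frac{n}{(n+2) \cdot 2^n - 1}\right)^{2^n c} \\
&= e^{-\frac{c n 2^n}{(n+2) \cdot 2^n - 1}} - o(1) \\
&\ge e^{-c} - o(1),
\end{aligned}$$

since $\frac{n \cdot 2^n}{(n+2) \cdot 2^n - 1} < 1$. This proves the lower bound. 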
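
The lower-bound computation can be checked numerically. The Python sketch below evaluates the probability that none of the $2^n c$ clauses is a positive unit clause, using the clause count $(n+2) \cdot 2^n - 1$ as it appears in the display above, and compares it with $e^{-c}$. The function name is ours, and `math.log1p` is used only to keep the tiny per-clause probability from being lost to rounding.

```python
import math


def prob_no_positive_unit(n: int, c: float) -> float:
    """(1 - n / ((n+2) 2^n - 1)) ** (c 2^n), computed in log space."""
    total = (n + 2) * 2 ** n - 1  # clause count taken from the displayed formula
    p = n / total                 # chance that a single draw is a positive unit clause
    m = c * 2 ** n                # number of clauses in the formula
    return math.exp(m * math.log1p(-p))


if __name__ == "__main__":
    c = 1.0
    for n in (5, 10, 20, 40, 60):
        print(n, prob_no_positive_unit(n, c), math.exp(-c))
```

For every n the computed value stays above $e^{-c}$, in agreement with the last inequality of the display; the gap closes only slowly, since the exponent behaves like $-c \cdot n/(n+2)$.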

In order to prove the upper bound, define $p = \log_2 n + \log \log n$, let Y be the 
event "PUR accepts," and let Z be the event "PUR stops in at most p iterations." By 
Lemma 6.2, $\Pr[Z] = 1 - o(1)$, so $\Pr[Y] \le \Pr[Y \mid Z] + o(1)$. However, given Z, Y 
is equivalent to $A_n \vee (A_{n-1} \wedge S_n) \vee \cdots \vee (A_{n-p+1} \wedge S_n \wedge \cdots \wedge S_{n-p+2})$. So, by the Bayes 
rule, $\Pr[Y \mid Z]$ is at most 

$$\Pr[A_n] + \Pr[A_{n-1} \mid S_n] + \cdots + \Pr[A_{n-p+1} \mid S_n \wedge S_{n-1} \wedge \cdots \wedge S_{n-p+2}].$$

We cannot apply Fact 3.1 directly, because this sum has an unbounded number of 
terms. Instead, we will use the following simple consequence of Bayes conditioning: 

$$\Pr[A_i \mid S_n \wedge \cdots \wedge S_{i+1}] \le \Pr[A_i \mid S_n \wedge \cdots \wedge S_{i+1} \wedge E_i] + \Pr[\overline{E_i} \mid S_n \wedge \cdots \wedge S_{i+1}].$$

From Proposition 6.1 the sum of all "second terms" is o(1). As to the first term, 
the conditioning implies that the clauses of $\Phi_i$ are chosen uniformly at random and 
their number is between $y_i$ and $z_i$. Since PUR accepts $\Phi_i$ if and only if $\Phi_i$ contains 
no positive unit clauses, we have 

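$$\left(1 - \frac{i}{(i+2)2^i - 1}\right)^{z_i} \le \Pr[A_i \mid S_n \wedge \cdots \wedge S_{i+1} \wedge E_i] \le \left(1 - \frac{i}{(i+2)2^i - 1}\right)^{y_i}, \qquad (6.3)$$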



in particular 

Pr[^|S n A--- AS i+1 A.Ei] < ( 1 



(i + 2)2* 



The right hand side is less or equal than e (*+2)2«-i. Since > ^ and f/j > 
N n ■ (1 - n ^i S g^+f gi S ogn ) ^ for a sufficiently large n we have, (assuming such 

an n) e (<+a)2*-i < e (.+2)2^ < e — . 

Summing up all these upper bounds for $\Pr[A_i \mid S_n \wedge \cdots \wedge S_{i+1} \wedge E_i]$ and observing 
the exponents as part of the progression $\{\frac{c}{4} \cdot j\}$, we obtain the desired upper bound. 



7. PUTTING IT ALL TOGETHER 



Now we complete the proof of Theorem 2.1 by proving equation (2.1). 
In order to prove this result it suffices to show that 

$$\lim_{n \to \infty} \Pr_{\Phi \in \Omega(n,m)}[\text{PUR rejects } \Phi] = F(e^{-c}). \qquad (7.1)$$

It is easy to see that F is well-defined on (0, 1) and has the following Taylor series 
expansion 

$$F(x) = (-1)^{b_0} + (-1)^{b_1} x + (-1)^{b_2} x^2 + \cdots + (-1)^{b_i} x^i + \cdots$$

with bi being the number of ones in the binary representation of i. Also F is 
monotonically decreasing, positive on (0, 1), and has limit 1 at 0. 
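
Both descriptions of F are easy to compare numerically. The Python sketch below (function names and truncation levels are our own choices) evaluates the product $(1 - x)(1 - x^2)(1 - x^4) \cdots$ and the truncated series with signs $(-1)^{b_i}$, and also checks the identity $F(x) = (1 - x) F(x^2)$, which is immediate from the product form.

```python
import math


def F_product(x: float, factors: int = 60) -> float:
    """Truncated product (1 - x)(1 - x^2)(1 - x^4)... for 0 < x < 1."""
    result = 1.0
    power = x
    for _ in range(factors):
        result *= 1.0 - power
        power *= power  # x^(2^i) -> x^(2^(i+1))
    return result


def F_series(x: float, degree: int = 4096) -> float:
    """Truncated series sum of (-1)^{b_i} x^i, with b_i the number of ones in binary i."""
    total = 0.0
    for i in range(degree + 1):
        sign = -1.0 if bin(i).count("1") % 2 else 1.0
        total += sign * x ** i
    return total


if __name__ == "__main__":
    for c in (0.5, 1.0, 3.0):
        x = math.exp(-c)
        print(c, F_product(x), F_series(x), (1.0 - x) * F_product(x * x))
```

The three printed columns agree to floating-point accuracy for x in (0, 1), and the product values decrease as x grows, consistent with the monotonicity stated above.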

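Fix $\varepsilon > 0$. Let R be the event "PUR rejects $\Phi$". What we need to show is that 
for a sufficiently large n, 

$$(1 - \varepsilon)\Pi \le \Pr[R] \le (1 + \varepsilon)\Pi. \qquad (7.2)$$

Since $\Pi$ converges and $\Pi > 0$, there exists some $k_0$ such that for all $k \ge k_0$, 

$$\sqrt{1 - \varepsilon} \le \frac{\Pi_k}{\Pi} \le (1 + \varepsilon). \qquad (7.3)$$

By Lemma 6.6, there exist some $n_0 > 0$ and $c_0 > 0$ such that for every $n \ge n_0$ 
and every $c \ge c_0$, $\Pr_{\Phi \in \Omega(n, 2^n c)}[\text{PUR rejects } \Phi] \ge \sqrt{1 - \varepsilon}$. Keeping in mind the fact 
that events $A_n, A_{n-1}, \ldots, A_{n-k+1}$ are incompatible with R we obtain the equality 

$$\Pr[R] = \Pr[R \mid \overline{A_n} \wedge \cdots \wedge \overline{A_{n-k+1}}] \cdot \Pr[\overline{A_n}] \cdot \prod_{1 \le i < k} \Pr[\overline{A_{n-i}} \mid \overline{A_n} \wedge \cdots \wedge \overline{A_{n-i+1}}]$$

for every fixed k. 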

Although conceptually simple, the rest of the proof is a little bit cumbersome. 
We first consider the case $c > 4 \ln 2$ (so that the upper bound in Lemma 6.6 is 
strictly less than one). 

Choose k so that, for large enough n, $y_{n-k} > c \cdot 2^{n-k}$. This is possible since 

y„_ fe > C . 2 «[l-^]. 

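We claim (and it is in the proof of these two relations where the assumption 
$c > 4 \ln 2$ will be used) that for every j, $n - k \le j \le n$, 

$$\Pr[\overline{A_j} \mid \overline{A_n} \wedge \cdots \wedge \overline{A_{j+1}}] = \Pr[\overline{A_j} \mid S_n \wedge \cdots \wedge S_{j+1}] + o(1), \qquad (7.4)$$

and 

$$\Pr[R \mid \overline{A_n} \wedge \cdots \wedge \overline{A_{j+1}}] = \Pr[R \mid S_n \wedge \cdots \wedge S_{j+1}] + o(1). \qquad (7.5)$$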



We will postpone proving these equations and will see how the theorem can be 
proven from these equations. 

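From equations 6.3 and 7.4 it follows that 

$$1 - \left(1 - \frac{n-i}{(n-i+2)2^{n-i} - 1}\right)^{y_{n-i}} - o(1) \qquad (7.6)$$

$$\le \Pr[\overline{A_{n-i}} \mid \overline{A_n} \wedge \cdots \wedge \overline{A_{n-i+1}}] \qquad (7.7)$$

$$\le 1 - \left(1 - \frac{n-i}{(n-i+2)2^{n-i} - 1}\right)^{z_{n-i}} + o(1).$$

This proves that, for every $i = 1, \ldots, k$, 

$$\lim_{n \to \infty} \Pr[\overline{A_{n-i}} \mid \overline{A_n} \wedge \cdots \wedge \overline{A_{n-i+1}}] = (1 - e^{-c \cdot 2^i}).$$

In a similar vein, we have, for large enough n, 

$$\sqrt{1 - \varepsilon} \le \Pr[R \mid \overline{A_n} \wedge \cdots \wedge \overline{A_{n-k+1}}] \le 1.$$

If we take a large enough n, since the second part is asymptotically equal to $\Pi_k$, 
by (7.3) we have (7.2). 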

For a general c > 0, define $c^*$ to be the infimum of all c's for which the relation 
7.1 holds for every $c' \ge c$. Suppose $c^* > 0$. The single-step version of (7.5) 
provides $\Pr[R \mid \overline{A_n}] = \Pr[R \mid S_n] + o(1)$, so $\Pr[R] = \Pr[\overline{A_n}] \Pr[R \mid \overline{A_n}] + o(1)$. Let $c < c^*$ 
and let n\ be such that for all n > m, 2c(l — ^) 2 > c*. By Fact 3.1 and Propo- 
sition 6.1 we have Pr[R\S n ] = Pr[R\S n A £„_i] + o(l). Then by Fact 3.1 we have 
$\min_{\lambda \in I} \{\Pr[R \mid S_n \wedge E_{n-1} \wedge (N_{n-1} = \lambda)]\} \le \Pr[R \mid S_n \wedge E_{n-1}] \le \max_{\lambda \in I} \{\Pr[R \mid S_n \wedge E_{n-1} \wedge (N_{n-1} = \lambda)]\}$. 
Conditioned on surviving stage n and on the value of $N_{n-1}$, 
$\Phi_{n-1}$ is a random formula. Since both $y_{n-1}$ and $z_{n-1}$ are asymptotically equal to 
$2^n c$, for large n, $\Phi_{n-1}$ is a random formula with $n - 1$ variables and $2^{n-1} \cdot (2c + o(1))$ 
clauses. Thus, $\lim_{n \to \infty} \Pr[R \mid S_n] = \lim_{n \to \infty} \Pr[R \mid S_n \wedge E_{n-1}] = F(e^{-2c})$. Since 
$\Pr[\overline{A_n}]$ is asymptotically equal to $1 - e^{-c}$, and $F(e^{-c}) = (1 - e^{-c})F(e^{-2c})$, (7.2) holds 
for c. This shows that $c^* = 0$, hence 7.1 is true for every c > 0. 

Now what remains is to prove (7.4) and (7.5). We will prove only (7.4); proving 
the other is quite similar. Let T be the event that PUR rejects in one of the 
first k stages. Note that $\Pr[T] = o(1)$, as seen in Lemma 6.5. Note that $T = 
R_n \vee (S_n \wedge R_{n-1}) \vee \cdots \vee (S_n \wedge \cdots \wedge S_{n-k+2} \wedge R_{n-k+1})$, so the probability of each 
of the k terms in the disjunction is o(1). 

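Note that 

$$\Pr[\overline{A_n} \wedge \cdots \wedge \overline{A_j}] = \sum_{\substack{r \ge j+1 \\ e_r \in \{-1,+1\}}} \Pr[\overline{A_n} \wedge \cdots \wedge \overline{A_j} \wedge R_n^{e_n} \wedge \cdots \wedge R_{j+1}^{e_{j+1}}],$$

where $X^{-1}$ denotes the opposite of the event X. All terms in the sum, other than 
$\Pr[\overline{A_n} \wedge \cdots \wedge \overline{A_j} \wedge R_n^{-1} \wedge \cdots \wedge R_{j+1}^{-1}]$, are either inconsistent (the algorithm rejects 
twice) or imply one of the terms appearing in the disjunction of the decomposition 
of T. Thus, 

$$\Pr[\overline{A_n} \wedge \cdots \wedge \overline{A_j}] = \Pr[\overline{A_n} \wedge \overline{R_n} \wedge \cdots \wedge \overline{A_{j+1}} \wedge \overline{R_{j+1}} \wedge \overline{A_j}] + o(1),$$

that is, 

$$\Pr[\overline{A_n} \wedge \cdots \wedge \overline{A_j}] = \Pr[S_n \wedge \cdots \wedge S_{j+1} \wedge \overline{A_j}] + o(1).$$

Similarly, $\Pr[\overline{A_n} \wedge \cdots \wedge \overline{A_{j+1}}] = \Pr[S_n \wedge \cdots \wedge S_{j+1}] + o(1)$. 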

Note that for every sequence of events $A_n$ and $B_n$ with $\liminf_{n \to \infty} \Pr[B_n] > 0$, 
$\left| \frac{\Pr[A_n] + o(1)}{\Pr[B_n] + o(1)} - \frac{\Pr[A_n]}{\Pr[B_n]} \right| = o(1)$. So, it suffices to show that $\liminf_{n \to \infty} \Pr[S_n \wedge \cdots \wedge 
S_{n-k}] > 0$. This probability is $1 - \Pr[\text{PUR accepts in one of the first k steps}] 
- \Pr[\text{PUR rejects in one of the first k steps}]$, and thus, is at least 
$1 - \frac{e^{-c/4}}{1 - e^{-c/4}} - o(1) - \Pr[T]$. Since $\frac{e^{-c/4}}{1 - e^{-c/4}} < 1$, the required condition is guaranteed. 



8. PROOF OF THEOREM 2.2 



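From equations (7.4) and (6.3) and Proposition 6.1 it follows that the probability 
that the algorithm accepts exactly at Stage k, given that it has not stopped before, 
tends (as $n \to \infty$) to $e^{-2^k c}$. We have 

$$\begin{aligned}
\Pr[A_{n-k} \wedge [\Phi \in \text{SAT}]] &= \Pr[A_{n-k} \wedge [\Phi \in \text{SAT}] \wedge S_{n-k+1}] \\
&= \Pr[A_{n-k} \wedge [\Phi \in \text{SAT}] \mid S_{n-k+1}] \cdot \Pr[S_{n-k+1}] \\
&= \Pr[A_{n-k} \mid S_{n-k+1}] \cdot \Pr[S_{n-k+1}].
\end{aligned}$$

Therefore 

$$P_k = \lim \Pr[A_{n-k} \mid \Phi \in \text{SAT}] = \lim \frac{\Pr[A_{n-k} \mid S_{n-k+1}]}{\Pr[\Phi \in \text{SAT}]} \cdot \Pr[S_{n-k+1}]$$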



9. PROOF OF THEOREM 2.3 



We will only provide an outline of the proof of Theorem 2.3, since its overall 
philosophy is quite similar to the one used to prove Theorem 2.1. 

Redefine, for the purpose of this section, the index k to refer to events taking 
place at stage $n - \lfloor \log_2(n) \rfloor - k$. For instance $S_k$ is the same as the event $Y_n > 
n - \lfloor \log_2(n) \rfloor - k$. 

Theorem 2.3 follows, of course, from the following claim: 

Lemma 9.1. 

$$\lim_{n \to \infty} \left( \Pr[Y_n > \lfloor \log_2(n) \rfloor + k \mid R] - G(k, c_n) \right) = 0. \qquad (9.1)$$

To prove Lemma 9.1 we first show, using methods similar to the ones used to 
prove Lemma 6.6, the following result: 

Lemma 9.2. 

$$\lim_{k \to -\infty} \liminf_{n \to \infty} \Pr[Y_n > \lfloor \log_2(n) \rfloor + k \mid R] = 1.$$

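The proof of Lemma 9.1 proceeds now by observing that 

$$\begin{aligned}
\Pr[(Y_n > \lfloor \log_2(n) \rfloor + k) \wedge R] &= \Pr[S_k \wedge R] = \Pr[S_{k-1} \wedge \overline{R_k} \wedge R] \\
&= \Pr[\overline{R_k} \wedge R \mid S_{k-1}] \Pr[S_{k-1}] \\
&= (\Pr[\overline{R_k} \mid S_{k-1}] - o(1)) \cdot (\Pr[S_{k-1} \wedge R] + o(1)) \\
&= (\Pr[\overline{R_k} \mid S_{k-1} \wedge E_k] - o(1)) \cdot (\Pr[S_{k-1} \wedge R] + o(1)) \\
&= \Pr[\overline{R_k} \mid S_{k-1} \wedge E_k] \cdot \Pr[(Y_n > \lfloor \log_2(n) \rfloor + k - 1) \wedge R] + o(1).
\end{aligned}$$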



By Lemma 6.3 the first term is approximately e c "' 2 . 

Iterating downwards for a constant number of steps, up to $k_0 \in \mathbb{Z}$, we infer 

$$\Pr[Y_n > \lfloor \log_2(n) \rfloor + k \mid R] = \Pr[Y_n > \lfloor \log_2(n) \rfloor + k_0 \mid R] \cdot \prod_{j = k_0 + 1}^{k} \Pr[\overline{R_j} \mid S_{j-1} \wedge E_j] + o(1).$$

Choosing $k_0$ small enough so that, by Lemma 9.2, the first term is "close enough 
to 1" and the product is "close enough to $G(k, c_n)$" proves relation 9.1. 



10. FURTHER DISCUSSIONS AND OPEN PROBLEMS 



There are several versions of Horn satisfiability whose phase transition is worth 
studying. One of them is the class of extended Horn formulas [CH91, SAFS95], for 
which PUR is still a valid algorithm [CH91]. On the other hand, Horn-like restric- 
tions have been employed to design tractable restrictions of various formalisms of 
interest in Artificial Intelligence, for example in constraint programming, temporal 
reasoning, spatial reasoning, etc. In many such cases positive unit resolution has 
natural analogs (for instance, arc-consistency in the case of ORD-HORN formulas 
in temporal reasoning [NB95]), and it would be interesting to see whether the ideas 
in this paper can inspire similar results. 

Let us also remark that, as shown in [Ist], the average-case behavior of PUR, 
as displayed in Theorem 2.2, is responsible for a physical property called critical 
behavior, widely studied in Statistical Mechanics and related areas (see, for instance, 
[Sla94] for the case of percolation), and similar to the one observed experimentally 
in [KS94] for the case of k-SAT. 

One final issue is whether one can meaningfully define and study the existence of 
a "physical phase transition" in HORN-SAT. The major problem is a "degeneracy" 
property of our random model for Horn satisfiability: one can satisfy all but the 
positive unit clauses of any formula by the assignment 11...1. But under the 
random model employed in this paper the fraction of such clauses is o(l), a property 
that is not shared by any of the previously studied problems, and which makes the 
"physical interpretation" problematic. Whether the problem becomes meaningful 
under a different random model remains to be seen. 